A Semi-Supervised Method for Arabic Word Sense Disambiguation Using a Weighted Directed Graph

نویسندگان

  • Laroussi Merhbene
  • Anis Zouaghi
  • Mounir Zrigui
چکیده

In this paper, we propose a new semisupervised approach for Arabic word sense disambiguation. Using the corpus and Arabic Wordnet, we define a method to cluster the sentences containing ambiguous words. For each sense, we generate a cluster that we use to construct a semantic tree. Furthermore, we construct a weighted directed graph by matching the tree of the original sentence with semantic trees of each sense candidate. To find the correct sense, we use a similarity score based on three collocation measures that will be classified using a novel voting procedure. The proposed method gives a high rate of recall and precision.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Merging Word Senses

WordNet, a widely used sense inventory for Word Sense Disambiguation(WSD), is often too fine-grained for many Natural Language applications because of its narrow sense distinctions. We present a semi-supervised approach to learn similarity between WordNet synsets using a graph based recursive similarity definition. We seed our framework with sense similarities of all the word-sense pairs, learn...

متن کامل

Using the Mutual k-Nearest Neighbor Graphs for Semi-supervised Classification on Natural Language Data

The first step in graph-based semi-supervised classification is to construct a graph from input data. While the k-nearest neighbor graphs have been the de facto standard method of graph construction, this paper advocates using the less well-known mutual k-nearest neighbor graphs for high-dimensional natural language data. To compare the performance of these two graph construction methods, we ru...

متن کامل

Data-Driven Graph Construction for Semi-Supervised Graph-Based Learning in NLP

Graph-based semi-supervised learning has recently emerged as a promising approach to data-sparse learning problems in natural language processing. All graph-based algorithms rely on a graph that jointly represents labeled and unlabeled data points. The problem of how to best construct this graph remains largely unsolved. In this paper we introduce a data-driven method that optimizes the represe...

متن کامل

Semi-Supervised Word Sense Disambiguation Using Word Embeddings in General and Specific Domains

One of the weaknesses of current supervised word sense disambiguation (WSD) systems is that they only treat a word as a discrete entity. However, a continuous-space representation of words (word embeddings) can provide valuable information and thus improve generalization accuracy. Since word embeddings are typically obtained from unlabeled data using unsupervised methods, this method can be see...

متن کامل

On Robustness and Domain Adaptation using SVD for Word Sense Disambiguation

In this paper we explore robustness and domain adaptation issues for Word Sense Disambiguation (WSD) using Singular Value Decomposition (SVD) and unlabeled data. We focus on the semi-supervised domain adaptation scenario, where we train on the source corpus and test on the target corpus, and try to improve results using unlabeled data. Our method yields up to 16.3% error reduction compared to s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013